AITopics | span detection

Learning to Reason for Hallucination Span Detection

Su, Hsuan, Hu, Ting-Yao, Koppula, Hema Swetha, Krishna, Kundan, Pouransari, Hadi, Hsieh, Cheng-Yu, Koc, Cem, Cheng, Joseph Yitan, Tuzel, Oncel, Vemulapalli, Raviteja

arXiv.org Artificial Intelligence, Oct-10-2025

Over the past few years, Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of natural language processing tasks (Xie et al., 2023; Zhang et al., 2023; Gao et al., 2024; OpenAI et al., 2024). However, they are still prone to generating hallucinations: content that is not supported by the input context or the underlying knowledge sources (Zhu et al., 2024; Kalai et al., 2025; Huang et al., 2025). Hallucinations pose critical risks in downstream applications such as summarization and long-form question answering, where reliability and factual consistency with respect to the input context are paramount. Hence, the ability to detect hallucinations is crucial for successful real-world deployment of LLMs. Most existing research focuses on the binary hallucination detection problem, where the goal is to determine whether the model output contains hallucinations (Yang et al., 2024a,b; Tang et al., 2024; Ravi et al., 2024; Ji et al., 2024; Chuang et al., 2024). While useful, this formulation is limited: in many real-world applications, one needs to know which specific spans in the model output are hallucinated in order to assess the reliability of the generated content. This motivates the problem of hallucination span detection, where the goal is to precisely locate unsupported content in the model output (Wu et al., 2023; Ogasa and Arase, 2025). Recently, reasoning, the process of systematically arriving at conclusions by generating and utilizing intermediate steps, has been shown to significantly enhance the capabilities of LLMs in solving complex tasks such as mathematics (Shao et al., 2024; Yu et al., 2025) and coding (Liu and Zhang, 2025; Chen et al., 2025). Hallucination span detection is also a complex multi-step decision-making process: it requires carefully analyzing the model output to extract all the stated facts and verifying whether each of these facts is fully supported by the input context. It could therefore benefit significantly from a learned reasoning process.

large language model, machine learning, natural language, (20 more...)

arXiv: 2510.02173

Country:

Asia (0.93)
North America > United States > California (0.28)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

ViToSA: Audio-Based Toxic Spans Detection on Vietnamese Speech Utterances

Do, Huy Ba, Huynh, Vy Le-Phuong, Nguyen, Luan Thanh

arXiv.org Artificial Intelligence, Jun-3-2025

Toxic speech on online platforms is a growing concern, impacting user experience and online safety. While text-based toxicity detection is well-studied, audio-based approaches remain underexplored, especially for low-resource languages like Vietnamese. This paper introduces ViToSA (Vietnamese Toxic Spans Audio), the first dataset for toxic spans detection in Vietnamese speech, comprising 11,000 audio samples (25 hours) with accurate human-annotated transcripts. We propose a pipeline that combines ASR and toxic spans detection for fine-grained identification of toxic content. Our experiments show that fine-tuning ASR models on ViToSA significantly reduces WER when transcribing toxic speech, while the text-based toxic spans detection (TSD) models outperform existing baselines. These findings establish a novel benchmark for Vietnamese audio-based toxic spans detection, paving the way for future research in speech content moderation.
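The abstract reports ASR quality as word error rate (WER). As a reminder of what that number measures, here is a minimal sketch of the standard definition: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions for illustration; the paper's own evaluation tooling and text normalization are not described in this listing.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: tokenizes on whitespace and applies no text
    normalization (casing, punctuation), which real evaluations often do.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

With a four-word reference, one substituted word and one dropped word give a WER of 2 / 4 = 0.5. Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.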

detection, machine learning, natural language, (17 more...)

arXiv: 2506.00636

Country:

Asia (1.00)
North America > Mexico (0.28)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Social Media (0.70)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.31)

Hidden Persuasion: Detecting Manipulative Narratives on Social Media During the 2022 Russian Invasion of Ukraine

Akhynko, Kateryna, Kosovan, Oleksandr, Trokhymovych, Mykola

arXiv.org Artificial Intelligence, Jun-2-2025

This paper presents one of the top-performing solutions to the UNLP 2025 Shared Task on Detecting Manipulation in Social Media. The task focuses on detecting and classifying rhetorical and stylistic manipulation techniques used to influence Ukrainian Telegram users. For the classification subtask, we fine-tuned the Gemma 2 language model with LoRA adapters and applied a second-level classifier leveraging meta-features and threshold optimization. For span detection, we employed an XLM-RoBERTa model trained on multiple targets, including binary token classification. Our approach achieved 2nd place in classification and 3rd place in span detection.
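The abstract mentions threshold optimization without saying how it was done. One common form, sketched here purely as an assumption, is to scan candidate decision thresholds on held-out validation scores and keep the one that maximizes F1. The names `f1_at_threshold` and `best_threshold`, the candidate grid, and the choice of F1 as the objective are all hypothetical and are not taken from the paper.

```python
def f1_at_threshold(scores, labels, threshold):
    """F1 of the rule `score >= threshold` against binary gold labels."""
    preds = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p and y)
    fp = sum(1 for p, y in zip(preds, labels) if p and not y)
    fn = sum(1 for p, y in zip(preds, labels) if not p and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scores, labels, candidates=None):
    """Return the candidate threshold with the highest validation F1.

    Ties go to the smallest candidate, because max() keeps the first
    maximum it encounters and the default grid is ascending.
    """
    if candidates is None:
        # Hypothetical grid: 0.01, 0.02, ..., 0.99.
        candidates = [i / 100 for i in range(1, 100)]
    return max(candidates, key=lambda t: f1_at_threshold(scores, labels, t))
```

In a multi-label setting, a search like this would typically be run once per label on validation data, with the chosen thresholds then frozen for test-time prediction.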

machine learning, manipulation technique, natural language, (16 more...)

arXiv: 2505.24028

Country:

Europe > Ukraine (0.51)
Asia > Russia (0.41)

Genre: Research Report > New Finding (0.46)

Industry:

Government > Military (0.50)
Government > Regional Government > Europe Government > Russia Government (0.41)
Government > Regional Government > Asia Government > Russia Government (0.41)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

BANER: Boundary-Aware LLMs for Few-Shot Named Entity Recognition

Guo, Quanjiang, Dong, Yihong, Tian, Ling, Kang, Zhao, Zhang, Yu, Wang, Sijie

arXiv.org Artificial Intelligence, Dec-3-2024

Despite the recent success of two-stage prototypical networks in few-shot named entity recognition (NER), challenges such as over/under-detected false spans in the span detection stage and unaligned entity prototypes in the type classification stage persist. Additionally, LLMs have not proven to be effective few-shot information extractors in general. In this paper, we propose an approach called Boundary-Aware LLMs for Few-Shot Named Entity Recognition to address these issues. We introduce a boundary-aware contrastive learning strategy to enhance the LLM's ability to perceive entity boundaries for generalized entity spans. Additionally, we utilize LoRAHub to align information from the target domain to the source domain, thereby enhancing adaptive cross-domain classification capabilities. Extensive experiments across various benchmarks demonstrate that our framework outperforms prior methods, validating its effectiveness. In particular, the proposed strategies demonstrate effectiveness across a range of LLM architectures. The code and data are released on https://github.com/UESTC-GQJ/BANER.

computational linguistic, large language model, machine learning, (20 more...)

arXiv: 2412.02228

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Singapore (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(10 more...)

Genre: Research Report (0.64)

Industry: Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Bangla Grammatical Error Detection Leveraging Transformer-based Token Classification

Islam, Shayekh Bin, Tanvir, Ridwanul Hasan, Afnan, Sihat

arXiv.org Artificial Intelligence, Nov-13-2024

Bangla is the seventh most spoken language in the world by total number of speakers, yet the development of an automated grammar checker for this language is an understudied problem. Bangla grammatical error detection is the task of detecting substrings of a Bangla text that contain grammatical, punctuation, or spelling errors, which is crucial for developing an automated Bangla typing assistant. Our approach frames the task as a token classification problem and uses state-of-the-art transformer-based models. We then combine the outputs of these models and apply rule-based post-processing to generate a more reliable and comprehensive result. Our system is evaluated on a dataset consisting of over 25,000 texts from various sources. Our best model achieves a Levenshtein distance score of 1.04. Finally, we provide a detailed analysis of different components of our system.

arxiv, detection, proceedings, (14 more...)

arXiv: 2411.08344

Country:

North America > United States > Washington > King County > Seattle (0.04)
Asia > India > Maharashtra > Mumbai (0.04)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MedDec: A Dataset for Extracting Medical Decisions from Discharge Summaries

Elgaar, Mohamed, Cheng, Jiali, Vakil, Nidhi, Amiri, Hadi, Celi, Leo Anthony

arXiv.org Artificial Intelligence, Aug-23-2024

Medical decisions directly impact individuals' health and well-being. Extracting decision spans from clinical notes plays a crucial role in understanding medical decision-making processes. In this paper, we develop a new dataset called "MedDec", which contains clinical notes of eleven different phenotypes (diseases) annotated by ten types of medical decisions. We introduce the task of medical decision extraction, aiming to jointly extract and classify different types of medical decisions within clinical notes. We provide a comprehensive analysis of the dataset, develop a span detection model as a baseline for this task, evaluate recent span detection approaches, and employ a few metrics to measure the complexity of data samples. Our findings shed light on the complexities inherent in clinical decision extraction and enable future work in this area of research. The dataset and code are available through https://github.com/CLU-UML/MedDec.

computational linguistic, medical decision, proceedings, (15 more...)

arXiv: 2408.12980

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.04)
North America > Canada > Ontario > Toronto (0.04)
(16 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.66)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.76)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)

Abusive Span Detection for Vietnamese Narrative Texts

Nguyen, Nhu-Thanh, Phan, Khoa Thi-Kim, Nguyen, Duc-Vu, Nguyen, Ngan Luu-Thuy

arXiv.org Artificial Intelligence, Dec-12-2023

Abuse in its various forms, including physical, psychological, verbal, sexual, financial, and cultural, has a negative impact on mental health. However, there are limited studies on applying natural language processing (NLP) in this field in Vietnam. Therefore, we aim to contribute by building a human-annotated Vietnamese dataset for detecting abusive content in Vietnamese narrative texts. We sourced these texts from VnExpress, Vietnam's popular online newspaper, where readers often share stories containing abusive content. Identifying and categorizing abusive spans in these texts posed significant challenges during dataset creation, but it also motivated our research. We experimented with lightweight baseline models by freezing PhoBERT and XLM-RoBERTa and using their hidden states in a BiLSTM to assess the complexity of the dataset. According to our experimental results, PhoBERT outperforms other models in both labeled and unlabeled abusive span detection tasks. These results indicate that it has the potential for future improvements.

dataset, detection, span detection, (11 more...)

doi: 10.1145/3628797.3628921

arXiv: 2312.07831

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.05)
Asia > Malaysia > Kuala Lumpur > Kuala Lumpur (0.04)
(8 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.68)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.66)
Media > News (0.54)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Unify word-level and span-level tasks: NJUNLP's Participation for the WMT2023 Quality Estimation Shared Task

Geng, Xiang, Lai, Zhejian, Zhang, Yu, Tao, Shimin, Yang, Hao, Chen, Jiajun, Huang, Shujian

arXiv.org Artificial Intelligence, Dec-11-2023

We introduce the submissions of the NJUNLP team to the WMT 2023 Quality Estimation (QE) shared task. Our team submitted predictions for the English-German language pair on both sub-tasks: (i) sentence- and word-level quality prediction; and (ii) fine-grained error span detection. This year, we further explore pseudo data methods for QE based on the NJUQE framework (https://github.com/NJUNLP/njuqe). We generate pseudo MQM data using parallel data from the WMT translation task. We pre-train the XLMR large model on pseudo QE data, then fine-tune it on real QE data. At both stages, we jointly learn sentence-level scores and word-level tags. Empirically, we conduct experiments to find the key hyper-parameters that improve the performance. Technically, we propose a simple method that converts the word-level outputs to fine-grained error span results. Overall, our models achieved the best results in English-German for both word-level and fine-grained error span detection sub-tasks by a considerable margin.
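The abstract does not spell out how word-level outputs are converted into error spans. A minimal sketch of one plausible conversion, not necessarily the authors' method, is to merge each run of consecutive BAD-tagged words into a single character span. The function name `tags_to_spans`, the `OK`/`BAD` tag names, and the assumption that tokens are joined by single spaces are all illustrative.

```python
def tags_to_spans(tokens, tags, bad_tag="BAD"):
    """Merge runs of consecutive bad-tagged tokens into character spans.

    Assumes the text is the tokens joined by single spaces. Returns a
    list of (start, end) character offsets with `end` exclusive.
    """
    spans = []
    pos = 0           # character offset where the current token starts
    run_start = None  # start offset of the currently open run, if any
    run_end = None    # end offset of the last bad token in that run
    for token, tag in zip(tokens, tags):
        start, end = pos, pos + len(token)
        if tag == bad_tag:
            if run_start is None:
                run_start = start
            run_end = end
        elif run_start is not None:
            # A good token closes the currently open run.
            spans.append((run_start, run_end))
            run_start = None
        pos = end + 1  # skip the single separating space
    if run_start is not None:
        spans.append((run_start, run_end))
    return spans
```

For the tokens `Das ist ein kleines Haus` with tags `OK BAD BAD OK BAD`, this yields the spans `(4, 11)` and `(20, 24)`, covering "ist ein" and "Haus". A real system would also need to carry over severity labels and handle subword tokenization, which this sketch omits.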

pseudo mqm data, severity, translation, (12 more...)

arXiv: 2309.13230

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.05)
Asia > China > Jiangsu Province > Nanjing (0.05)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Overview of the HASOC Subtrack at FIRE 2023: Identification of Tokens Contributing to Explicit Hate in English by Span Detection

Masud, Sarah, Khan, Mohammad Aflah, Akhtar, Md. Shad, Chakraborty, Tanmoy

arXiv.org Artificial Intelligence, Nov-16-2023

As hate speech continues to proliferate on the web, it is becoming increasingly important to develop computational methods to mitigate it. Reactively, using black-box models to identify hateful content can perplex users as to why their posts were automatically flagged as hateful. On the other hand, proactive mitigation can be achieved by suggesting rephrasing before a post is made public. However, both mitigation techniques require information about which part of a post contains the hateful aspect, i.e., what spans within a text are responsible for conveying hate. Better detection of such spans can significantly reduce explicitly hateful content on the web. To further contribute to this research area, we organized HateNorm at HASOC-FIRE 2023, focusing on explicit span detection in English Tweets. A total of 12 teams participated in the competition, with the highest macro-F1 observed at 0.58.
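The reported macro-F1 of 0.58 is easier to interpret with the metric's definition in hand: F1 is computed separately for each label and the per-label scores are averaged without weighting, so rare labels count as much as frequent ones. The sketch below assumes token-level labels (for example BIO tags); the exact evaluation protocol used by the shared task is not given in this listing, and the function name `macro_f1` is hypothetical.

```python
def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over labels seen in gold or pred."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and equally long")
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is never zero
        # here because each label occurs at least once in gold or pred.
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)
```

For gold labels `B I O O` and predictions `B O O O`, the per-label F1 scores are 1.0 for `B`, 0.0 for `I`, and 0.8 for `O`, giving a macro-F1 of 0.6 even though three of four tokens are correct.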

detection, proceedings, span, (15 more...)

arXiv: 2311.09834

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(6 more...)

Genre: Research Report (0.40)

Industry: Transportation (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Scaling up COMETKIWI: Unbabel-IST 2023 Submission for the Quality Estimation Shared Task

Rei, Ricardo, Guerreiro, Nuno M., Pombal, José, van Stigt, Daan, Treviso, Marcos, Coheur, Luisa, de Souza, José G. C., Martins, André F. T.

arXiv.org Artificial Intelligence, Sep-21-2023

We present the joint contribution of Unbabel and Instituto Superior Técnico to the WMT 2023 Shared Task on Quality Estimation (QE). Our team participated in all tasks: sentence- and word-level quality prediction (task 1) and fine-grained error span detection (task 2). For all tasks, we build on the COMETKIWI-22 model (Rei et al., 2022b). Our multilingual approaches are ranked first for all tasks, reaching state-of-the-art performance for quality estimation at word-, span- and sentence-level granularity. Compared to the previous state-of-the-art COMETKIWI-22, we show large improvements in correlation with human judgements (up to 10 Spearman points). Moreover, we surpass the second-best multilingual submission to the shared task by up to 3.8 absolute points.

language pair, prediction, span detection, (13 more...)

arXiv: 2309.11925

Country:

Europe > Portugal > Lisbon > Lisbon (0.05)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
